HISTORICAL INTRODUCTION


For over 150 years anomalies of color vision have remained obscure phe- 
nomena. The fault does not lie with the experimenters who have worked or 
are working in the field, but rather, with the elusive nature of the subject matter 
itself. There is still no theory extant which will cover all of the known facts in 
the field of color vision, although most authorities (40) agree that a three-color 
theory is most acceptable. With this absence of basic agreement as to hy- 
pothesis serving as a starting point, the confusion existing in the field at present 
becomes more intelligible. 

The color-vision test to be considered in this report is the Nela Test, a wool 
classification test developed by Knight Dunlap in the Nela Laboratories in 1923, 
improved and refined by Day. Haupt (17), Scheidt (44) and others have used 
the test and found it to be practical and to fulfill the demands of a wide variety 
of usages. In its original form it was composed of twenty-two test items, each 
consisting of three small skeins of yarn, but later was expanded by Haupt to 
47 items, the additions being further combinations of the original colored yarns. 
Scheidt (44), after an extensive application of the test, suggested further revision 
and shortening. Although much more complete as a diagnostic instrument in 
its revised form, the test proved unwieldy, due to its length, and not suited to 
the demands of large-scale testing programs. It is the purpose of the present 
research to effect still more compression of the test, in order that it will be placed 
in the category of the “adequate, practical” color-vision tests, and still not
sacrifice discrimination or validity. Statistics obtained with a revised form of the
test (24 items) will be presented together with data on such relevant variables
as age and sex. 


TEST MATERIALS EMPLOYED 


The Nela Test for Color Vision employs dyed worsteds arranged in 47 items; 
each item made up of three skeins of yarn. The individual skeins of yarn are
approximately three inches long and one inch wide, three of these skeins
constituting a test item. The colors comprising each triplet are so chosen that one
of the outside two skeins, in each case, is more similar in color (hue) to the center
skein, than is the other outside skein. The particular color combinations used
were selected on the basis of actual confusion in various “types” of color
deficiency. The task demanded by the test is simple, requiring only that the subject
indicate which of the outside two skeins of a given item is most like the
center skein, in terms of color only; saturation and intensity being disregarded 
as much as is possible in making the judgment. In standardizing the original 
test, persons making errors on the test were checked with a spectroscopic test.

The fact that no color naming is required in the test made it possible to include
mixed, “off-shades” of color, highly diagnostic in function. The problem
is fairly simple, as may be certified by the wide use of the test with groups of all
ages and backgrounds. It has been found to be a very efficient and practical
test, simple to administer. 

The test minimizes the usual objection to worsted tests by the arrangement of
triplets and by the technique of administration. It does not entirely eliminate
the possibility that the subject may make errors in terms of brightness or intensity,
but triplets confusing to the “normal” eye in terms of brightness are not
of sufficient number in the test to put the subject into the class of “defective”
because of brightness errors alone. The danger of such errors occurring can be
reduced considerably by careful, preliminary instructions to the subject. Also,
it should be remembered that persons with a slight color deficiency are likely to
make such errors, and thus, the diagnostic range of a test is increased (40). 

Besides giving a numerical score as to the number of errors made on the test,
the test situation allows subsequent diagnosis of the color difficulty in terms of
the actual items missed. Typical patterns or constellations of errors will be
noted. Comparisons with other color-vision tests, and right-eye, left-eye comparisons,
indicate a highly consistent and valid test situation.

To facilitate the use of the test on large, widely divergent population samples 


length, Scheidt (44) tested a large group of subjects and conducted an analysis of
the test. He found that fifteen of the forty-seven items could probably be eliminated
from the test, without seriously affecting its diagnostic value. Nine of
these triplets to be eliminated were found in Test I and six in Test II, leaving a
total of 32 items in the test as a whole. 


THE PROBLEM 


The present study was undertaken primarily to effect a revision of the Nela
Test for Color Vision. The Nela Test offers a unique and diagnostic test situation,
and could be used to advantage to supplement and replace more popular
tests of “color-blindness.” The principal deterrent to wide use of the Nela
Test has been, as previously mentioned, its length and the consequent time 
factor in administration. Even though Scheidt’s suggested revisions be put in 
force, and the test shortened to thirty-two items, the test remains discourag- 
ingly long, particularly to the commercial-industrial color-vision investigator. 
With this fact in mind, the number of items was cut down to twenty-four, a 
number which furnishes an adequate sample of the entire test, and still keeps the 
test-time reasonably brief. 

During the use of the 24-item revision, considerable data were gathered show- 
ing the relations of such factors as age, sex, race and smoking habits to the range 
of color anomalies. These data, and their inter-comparisons, are included in 
the report in the hope that additional light may be focused on the problems in the 
field. At the same time that the Nela Revision was administered, an eleven 
plate edition of the Ishihara Test, and a wool sorting test of the Holmgren 
type, were administered for the purposes of check and comparison.

The terms “color-blindness” and “color defective” have been loosely and erroneously
applied to a wide range of persons exhibiting deficient color vision of
whatever degree. This faulty designation has led to considerable confusion in
the field. Obviously the terms “color-blind” and “color defective” have specific
meaning only in terms of the purpose of the test, and the test used. For this
reason, data will be presented which should help to clarify the use of the terms
and add significance to the “established” norms (40-A).

The final aspect of the problem to be considered here is the selection of various 
degrees and types of color deficient cases, subjecting these groups to further tests 
and measures in the hope that additional facts may be forthcoming which will 
clarify certain of the variables which at present remain obscure. One of the 
most important of these is the investigation of possible remedial work which 
might be done. Preliminary work with Vitamin A as a possible remedial agent 
will be presented. 


APPARATUS AND TEST MATERIALS 


The selection of the 24 triplets to be used in the present revision was arbitrarily 
determined by two methods. First, items were eliminated on the basis of the 
suggestions made by Scheidt in his study (44). Secondly, the test in its original 
form was administered to groups of college students and to applicants for Los 
Angeles City Civil Service positions. There were 100 college men, 150 Civil 
Service applicant men, 100 college women and 150 Civil Service applicant women. 
The ‘‘Civil Service” group was made up of applicants for the positions of Police- 
man, Policewoman, Clerk-Typist and Laborer. 

No detailed analysis was attempted since the revision was intended to be only 
a rough, working revision which could be subjected to analysis later in the light 
of results obtained. The items finally included in the revision were those items
showing a high “diagnostic” error score for the group tested and checking with 
Scheidt’s more extensive analysis. In a few instances, items were included in 
the revision which Scheidt had suggested eliminating. Such items were included
only where the total error on the item, obtained on the group tested, was found
to be in serious disagreement with Scheidt’s study. These changes were justified
on the basis of the differences in the groups used for subjects in the two test
situations. The final list of items, in terms of the original triplet numbers, is:
From Test I: 1, 2, 3, 6, 14, 15, 16, 17, 18, 20, 22, 23 and 24; from Test II: 2, 3,
6, 7, 8, 10, 11, 12, 18, 19 and 22. Included in this final list are all but two of the
most diagnostic items (those items having the highest total error scores), according
to Scheidt’s study.
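
As a rough consistency check, the retained triplets can be tallied against the stated total of twenty-four. The sketch below only transcribes the item numbers listed in the text, reading the Test II list as beginning 2, 3 (an assumption about the printed list); the names RETAINED and count_retained are illustrative and not part of the original study.

```python
# Triplet numbers retained in the 24-item revision, keyed by original test.
# Assumption: the Test II list is read as beginning "2, 3".
RETAINED = {
    "Test I": [1, 2, 3, 6, 14, 15, 16, 17, 18, 20, 22, 23, 24],
    "Test II": [2, 3, 6, 7, 8, 10, 11, 12, 18, 19, 22],
}


def count_retained(retained):
    """Return the number of retained triplets per test and overall."""
    counts = {name: len(items) for name, items in retained.items()}
    counts["total"] = sum(len(items) for items in retained.values())
    return counts


print(count_retained(RETAINED))
```

On this reading, thirteen items come from Test I and eleven from Test II, which agrees with the twenty-four triplets of the revision.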

These twenty-four triplets were mounted in the order shown on the inside of a 
wooden case, 24 inches square. The surface was painted a dull black in order to 
reduce the glare, reflection and contrast effect. The items were spaced approxi- 
mately two inches apart to insure reduced confusion and contrast. The case 
was hinged at the back to make it possible to lay the entire test out flat before 
the subject. To protect the wools from dust, fading and soiling when not in use 
the case could be shut and latched, protecting the wools inside. 

The test was presented to the subject under “noon-daylight” illumination.
To insure constancy and standard illumination conditions, special 500 watt lamps
provided with “noon-daylight” filters (28-A) were obtained. The source of
illumination was kept at a constant distance from the test materials so as to 
insure adequate, standard and equally distributed illumination at every point 
in the test. A brass pointer was provided with which the subject indicated 
his choice on each item to protect the wools from the inevitable discoloration 
which results from perspiration and soiling when wools are handled by large 
numbers of people. Subjects were prevented from getting closer than 24 inches
to the test to further standardize conditions and insure the use of foveal vision. 
No effort was made to get right and left eye comparisons since Haupt and Scheidt 
have gathered rather complete data on these comparisons previously (17, 44). 
Every subject reported on here was given two tests over the same items, a first 
test and a subsequent retest. 

The Ishihara test was administered under the same conditions and under the 
same illumination according to directions given in the manual. The form used 
consists of eleven plates. In four of the plates the “color-blind” subject sees one
set of numbers, the “normal” another; in four of the plates the “normal” sees a
number but the “color-blind” sees no number; in two of the plates the “normal”
sees no number clearly but the “color-blind” does see a number clearly; and in the
final plate (missed by no subject in this group), a subject deficient in “blue-yellow”
sees nothing. Each subject was given both test and retest on the
Ishihara. 

The Holmgren-type wool sorting test used was presented under the same conditions
of illumination as the other tests. Seven “test” colors were arranged
along the top of a large board and the subject was required to sort out 70 test
items under the correct colors. The “test items” were small skeins of yarn,
mounted on cards to reduce soiling, and numbered on the back to facilitate scoring.
Errors were counted in terms of the number of “test items” incorrectly
classified. Because of the long period of time required to sort the large number
of items this test was given only to those subjects with error scores of over four 
on the Nela or over five on the Ishihara. It was included as an additional check 
or means of validation, and also made it possible to pick out with ease the “pattern”
of color deficiency exhibited.
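
The rule for referring a subject to the Holmgren-type test can be written as a simple predicate. This is a minimal sketch of the criterion as stated above (error score over four on the Nela or over five on the Ishihara); the function name and argument names are hypothetical.

```python
def needs_holmgren(nela_errors: int, ishihara_errors: int) -> bool:
    """Return True when a subject would be given the Holmgren-type test.

    Criterion from the text: an error score of over four on the Nela
    or over five on the Ishihara.
    """
    return nela_errors > 4 or ishihara_errors > 5


# A subject with four Nela errors and five Ishihara errors is not referred.
print(needs_holmgren(4, 5))
# A subject with five Nela errors is referred regardless of the Ishihara score.
print(needs_holmgren(5, 0))
```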

At present, no detailed examination of the critical subjects has been attempted 
since the spectroscopic apparatus required is not available. However, further 
investigation of the critical cases is planned for the immediate future. 


THE SUBJECTS 


The subjects tested in the present study were all students in beginning Psy- 
chology taken from seven classes at the University of California at Los Angeles 
during the fall and spring semesters of 1940-41. In order to secure a repre- 
sentative sample within the groups tested, and to prevent persons having knowl- 
edge of their color-vision defects from evading the examination, every member 
of every class included was tested. Too often, in the past, experimenters have
based color-vision norms on samples which volunteered for examination. When 
this method is employed, persons who have reason to suspect that they may 
have a defect will often not volunteer, and thus the norms for the group are not 
valid. This criticism has particularly been leveled against female groups. Be- 
cause of the fact that color plays a much more significant role in the daily life 
of the woman, color ignorance and color defects are apt to be humiliating and 
hence more carefully masked than among men. It has been suspected, ac- 
cordingly, that women with defective color vision have often, in past test situa- 
tions, failed to appear for examination, thus “skewing” results, contributing to 
the production of much lower norms for the sex as a whole, when compared to 
men. It is not implied that this factor accounts for the entire difference values 
noted between the two sexes, but it is undoubtedly one factor. Another im- 
portant factor might be the relatively large amount of training and exposure in 
the field of color received by women, as compared to men. Women’s clothes 
are traditionally brighter and more varied in hue combinations, resulting in 
greater familiarity and more practice in color judgment, and thus a better chance 
to learn to recognize and use secondary cues. 

It will be noted by reference to the summary of subjects in table 2 that the 
797 subjects were divided into two groups, Group I and Group II. The members
of Group I took the test and retest with at least a week intervening between the 
two tests. The members of Group II took the test and retest during the same 
fifteen minute period. In the case of the retests of Group II careful provision 
was made so that the retest was administered by another examiner, not the same 
examiner who administered the first test. By this method it was hoped the 
objectivity of the examining procedure would be increased. 

The subjects were asked to indicate, on a form provided, such factors as racial 
background, hair and eye color, smoking habits (in Group II only), art training, 
age, complexion and sex. It was hoped that some correlation between the factors
of race, pigmentation, age, sex and color-vision efficiency might be forthcoming.
Certain of these comparisons are included in this report. Membership
in each of these groups, according to the factors used, is indicated in table 2.

TABLE 2
Summary sheet of subjects

                               GROUP I            GROUPS I AND II        PERCENTAGES (per cent of)
FACTOR                    Male  Female  Total    Male  Female  Total     male  female   total

Unselected racial group
  1. ...                   140    198    338      252    382    634      75.0   83.0     79
  2. Oriental                1      8      9       14     16     30       4.0    3.0      3
  3. ...                     0      0      0      ...    ...    ...       0.0    0.1      0
  4. ...                     1      1      2      ...    ...    ...       2.0    0.4      1
  5. Jewish                ...    ...    ...       63     57    120      19.0   12.5     16
  6. ...                   ...    ...    ...      ...    ...    ...       0.5    0.0      0
  7. ...                   ...    ...    ...      ...    ...    ...       0.0    1.0      0
  Total                    142    207    349      336    461    797
  Per cent                 41%    59%   100%      42%    58%   100%       42%    58%   100%

Age groups
  16                         1      1      2      ...    ...    ...       ...    ...    ...
  17                       ...    ...    ...       24     61     85       ...   13.0   10.0
  18                        30     64     94       83    160    243       ...   34.0   31.5
  19                        44     92    136       91    148    239       ...   32.0   31.0
  20                        24    ...    ...       45     54     99      14.0   12.0   12.0
  21                        15      9     24       40     16     56       ...    ...    ...
  22                         9      0      9      ...    ...    ...       ...    ...    ...
  23                         3      0      3      ...    ...    ...       ...    ...    ...
  24                         4      1      5      ...    ...    ...       ...    ...    ...
  25                         3      0      3      ...    ...    ...       ...    ...    ...
  26                         0      0      0      ...    ...    ...       ...    ...    ...
  27                         2      0      2      ...    ...    ...       ...    ...    ...
  28                         0      0      0      ...    ...    ...       ...    ...    ...
  29                         0      0      0      ...    ...    ...       ...    ...    ...
  30 or over                 2      2      4      ...    ...    ...       ...    ...    ...

Smoking habits
  1. Never                                                               56.0   59.0   57.0
  2. Occasionally                                                        24.0   21.0   23.0
  3. Moderately                                                           7.0    8.0    8.0
  4. Heavy and regular                                                    6.0    3.0    4.0
  5. Light and regular                                                    6.0    8.0    7.0
  6. Formerly                                                             1.0    1.0    1.0


TEST PROCEDURE


The groups were first introduced to the test situation by a class presentation
of sample triplets. The test procedure was described and specific instructions
for taking the test were made explicit to the entire group. Each step in the
test procedure was explained, until the process was clear to all members.

The subjects then appeared individually for examination in air-conditioned, 
controlled-illumination test rooms. The test instructions were repeated for each
individual and he was required to make several trial selections on the sample 
skeins to make sure he understood the directions clearly. The subject was put 
through the Nela Revision, his score determined, and then he was put through 
the Ishihara. Later, if a large error score was found, he was put through the 
test of the Holmgren type. In no case, and at no stage of the test procedure 
prior to the retest, was the subject told his score. In the case of Group II, 
subjects were sent immediately, after the first test, to a second examiner to take 
the Nela Revision and the Ishihara once more. In the case of Group I, subjects 
were called back one week later for retests. 

At the retest the subjects were put through the tests in reverse order, starting
with item 24 and finishing with item 1 on the Nela, and starting with
Plate 11 on the Ishihara, and finishing with Plate 1. This was done to
break up any patterns of sequence which might have been learned from the first
test; and also to prevent possible “memorization” of the item choices, if such
a feat is possible. This technique also serves the function of presenting the
items in different order, and thus, different color relations. If there was any
extraneous factor, such as “contrast effect,” operating in the situation which
depended upon the order of item presentation, this factor would be negated by 
the retest procedure. For most of the data presented in the following section 
the scores used will be in terms of the retest scores, since these proved to be 
most reliable. 
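
The reversed presentation order used at the retest can be sketched in a few lines. The function below is only an illustration of the ordering described above; its name and parameters are hypothetical.

```python
def presentation_order(n_items: int, retest: bool = False) -> list:
    """Return item numbers in the order presented.

    The first test runs from item 1 upward; the retest runs in reverse,
    beginning with the last item and finishing with item 1.
    """
    order = list(range(1, n_items + 1))
    return order[::-1] if retest else order


nela_retest = presentation_order(24, retest=True)      # Nela Revision, 24 triplets
ishihara_retest = presentation_order(11, retest=True)  # Ishihara, 11 plates
print(nela_retest[0], nela_retest[-1])
print(ishihara_retest[0], ishihara_retest[-1])
```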

Instructions given to the subject on administering the Nela Revision were as 
follows: They were first shown the sample skeins or triplets and told: “In the
test situation, in each case look at the center skein of yarn (illustrating by point- 
ing to the sample triplets). Then I want you to tell me which of the outside 
two pieces of yarn (pointing to the skeins to the right and left of the center 
skein in one of the items) is most like, or is nearest to, the center skein in color. 
A very simple judgment is all that is required. You are not required to analyze
the color, just tell me which looks most like the center color in terms of color 
only (pointing to sample). Now, in this case, the skein on the right is the cor- 
rect choice—it looks more like the center skein in color. The skein on the left 
is closer in brightness, or intensity, but do not choose on that basis. Just make 
your choice on the basis of color (pointing to another item in the sample sheet). 
Now, which would be the correct choice in this case? (The subject indicates 
his choice, and the instruction continues through as many of the eight sample 
triplets as are needed to clarify the instructions.) (Hand the brass pointer to 
the subject.) Now, in the test, point to that skein on the right or left, which is 
most like the center skein, in terms of color only. Please begin with item num- 
ber one.” 

When the subject completed the test items, and the judgments were recorded, 
the examiner took the subject back over four or five test items, as a further 
check. If the subject hesitated, or made errors on any items, these items were
always included in the subsequent “spot-check.” By this double-judgment
system it was possible to make certain that errors made were not chance errors,
but apparently true or consistent. The same procedure was followed in the retest
situation, though instructions were abbreviated here to fit individual
demands.
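
One way to picture the spot-check is as a selection in which every item the subject hesitated over or missed is always included, and further items are added until four or five are reached. The text does not say how the remaining items were picked, so the random fill below is purely an assumption, and the function name and parameters are hypothetical.

```python
import random


def spot_check_items(all_items, flagged, n_checks=5, rng=None):
    """Choose items for the spot-check.

    Flagged items (hesitations or errors) are always included. Assumption:
    any remaining places are filled at random from the unflagged items.
    """
    rng = rng or random.Random()
    flagged_set = set(flagged)
    chosen = sorted(flagged_set)
    remaining = [item for item in all_items if item not in flagged_set]
    n_fill = max(0, n_checks - len(chosen))
    chosen += rng.sample(remaining, min(n_fill, len(remaining)))
    return sorted(chosen)


items = list(range(1, 25))
# The subject hesitated on items 8 and 21; both must appear in the check.
print(spot_check_items(items, flagged=[8, 21], rng=random.Random(0)))
```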


EXPERIMENTAL RESULTS 


In considering the numerical results of the experimental findings it is well to 
remember (see table 2) that the females tested constitute about 58 per cent of 
the total number and the males 42 per cent. Numerical data, when not presented
in percentages, should be interpreted in the light of this difference in
proportion of the sexes.


TABLE 3
Distribution of error scores on Nela

                      NUMBER OF ERRORS MADE ON NELA RETEST
GROUP      SEX        0    1    2    3    4    5    6    7    8   10   13   14   17   18   21   TOTAL

I          Male      82   27  ...    6  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    142
           Female   105   66   18    8  ...    2  ...  ...  ...  ...  ...  ...  ...  ...  ...    207
           Total    ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    349
II         Male      79   55   26   18  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    194
           Female   118  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    254
           Total    ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    448
I and II   Male     161  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    336
           Female   223  ...   55   26   14    6  ...  ...  ...  ...  ...  ...  ...  ...  ...    461
           Total    384  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...    797


The scores on the Nela Revision may be considered from two equally important
points of view. They may be presented in terms of “total error scores,”
or, in terms of the triplets missed (by item number) by each error group. Table 
3 presents the distribution of error scores in terms of the total number of errors 
made. Table 4 presents the same data, but in terms of the triplet numbers 
missed by each error group. 

An examination of table 3 shows that while the error range is extensive, including
fifteen error groups from “zero” errors to twenty-one errors out of a
possible twenty-four, the effective range does not exceed 5 errors. In terms of
percentages, there are only 2.34 per cent of the men, and 1.25 per cent of the
women included in the group missing more than five triplets in the retest. Con- 
sequently, the effective range is restricted since there remain too few cases for 
valid comparison above this point.

It is interesting to note the initial differences manifested in Group I and
Group II scores at the “zero error” point. Group I, the group which was retested
with a week intervening between test and retest, has a much larger number
of cases concentrated at the zero point than does Group II (this group being
retested immediately).

In terms of percentages, 57 per cent of the males in Group I made no errors, 
while only 40 per cent of the males in Group II made no errors; 50 per cent of 
the females in Group I made zero error scores, while only 46 per cent of the fe- 
males in Group II made zero error scores. If we assume that the samples conform
to normal selection laws, and that the two groups are alike in factors other
than the period between test and retest, it would appear that the length of time 
intervening between the two tests had some slight effect in a positive direction 
on the scores obtained. However, the differences are more apparent than real, 
as can be seen by comparing the figures for the group missing one, the group 
missing two triplets, and so on. Here, the observed initial difference breaks 
down, and we find the error ratio between the two groups remains rather con- 
stant. One factor illustrated is the importance of test familiarity. The results 
as a whole show that the error scores become smaller as the familiarity with the 
test situation increases. In administering the test it was found that in some 
cases the subjects would complete the entire test before the true judgment situation
was clear to them. This was true despite the fact that every precaution
was taken to make certain the subject understood the procedure before he began 
the test. This was the basis for using the retest figures for most of the compari- 
sons. Further analysis of the total error scores for the groups will be undertaken 
when table 7 is examined. 
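
The zero-error percentages quoted above can be recomputed from the counts in tables 2 and 3. The sketch below is an illustration only: the zero-error counts are those of table 3, the group sizes follow table 2, and the Group II male size of 194 is an assumption obtained by subtracting the 142 Group I males from the 336 males overall. All names are hypothetical.

```python
# (zero-error subjects, group size), transcribed from tables 2 and 3.
ZERO_ERROR_COUNTS = {
    ("I", "male"): (82, 142),
    ("I", "female"): (105, 207),
    ("II", "male"): (79, 194),   # assumption: 336 males overall less 142 in Group I
    ("II", "female"): (118, 254),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def zero_error_percentages(counts):
    """Map each (group, sex) to the percentage of subjects making no errors."""
    return {key: percent(n, size) for key, (n, size) in counts.items()}


for (group, sex), value in zero_error_percentages(ZERO_ERROR_COUNTS).items():
    print(f"Group {group} {sex}: {value:.1f} per cent with no errors")
```

Dropping the decimal part of each result gives the whole-number figures of 57, 50, 40 and 46 per cent quoted in the text.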

The data presented in table 4 show to what extent each triplet in the test 
contributed to the total error scores. The triplet numbers are listed along the 
X axis and the test, retest, and percentage figures along the Y axis. 

A comparison of the total figures for the test and retest errors on each triplet
(in rows 5 and 8, respectively) is sufficient to substantiate the statement made
above that the error scores tended to drop on the retest. In every case this 
reduction in number of errors has taken place. The same relation holds when 
the scores on the test items for the two sexes are considered separately. Both 
men and women made fewer errors on every triplet on the retest than on the 
first test. The differences are slightly less for the men than for the women. 
When the simplicity of the test situation is considered this significant drop in 
scores becomes even more difficult to understand. It should be remembered 
that none of the subjects were told their scores after the test, and had no way of
knowing whether or not they had made errors, and yet the differences are quite
noticeable. The difference cannot be explained in terms of learning the triplet 
combinations because even if this were possible, which is very unlikely, it would 
be caught in the “spot-check” described earlier. The best answer seems to be
in terms of familiarity with the test situation itself, and learning what type of 
judgment is actually expected in the test. Persons checked after the retest dis- 
played none of this variability. If the scores fluctuated without direction such 
score changes could be explained in terms of the difficulty of the test or the lack
of reliability, but this is not the case. The score changes are all in the same 
direction. 

The total error scores are given in the box on the right of the center rows. 
Nine hundred and three errors were made by the group of 797 on the retest. 
Since about 50 per cent of the entire group made no errors, this means that the 
903 errors were made by less than four hundred subjects, or an average of about 
2 errors per subject. Of further significance is the fact that the per cent of the 
total errors contributed by the females making mistakes, is about the same as 
that contributed by the males making mistakes. The males (42 per cent of the 
total group) made 436 of the errors, the females (58 per cent of the group)
made 467 of the errors. 
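
The figure of about two errors per erring subject can be checked with a short calculation. This sketch takes the retest total of 903 errors and the 797 subjects from the text, and assumes the zero-error count of 384 read from table 3; the names are illustrative.

```python
TOTAL_SUBJECTS = 797
TOTAL_RETEST_ERRORS = 903
ZERO_ERROR_SUBJECTS = 384  # assumption: zero-error total read from table 3


def mean_errors_among_erring(total_errors, n_subjects, n_zero):
    """Average number of errors among subjects who made at least one error."""
    erring = n_subjects - n_zero
    return total_errors / erring


mean = mean_errors_among_erring(
    TOTAL_RETEST_ERRORS, TOTAL_SUBJECTS, ZERO_ERROR_SUBJECTS
)
print(f"{mean:.1f} errors per erring subject")
```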

The percentage of males making errors is greater than the percentage of fe- 
males making errors (table 7) at the lower end of the error range, which would 
indicate that when a female makes errors, she is likely to make more errors than 
the male, though not a great many more. 

The last three rows in table 4 show the per cent of error contributed to the 
total error score by each individual triplet. Analysis of these per cent-error 
contributions indicates a rather high degree of consistency of error for the two 
sexes on all the triplets in the test. There are four triplets, however, which 
seem to present a more difficult choice situation for the women than for the 
men. These four triplets are numbers 5, 14, 15 and 16. All of these involve
judgments of “pastel” shades, with the exception of triplet number five. In
the case of this triplet, it was found by questioning the women making errors 
on the item, that contrast effect seemed to be the complicating factor. Almost 
without exception, the women missing this triplet reported projection of the color 
from the center skein onto the blue-gray skein at the left, calling it orchid, mak- 
ing it therefore the correct choice, but an error on the test. Men questioned at 
the same point did not seem to experience the difficulty in the same proportion. 
The difficulty with the pastel colors noted might possibly be explained in terms 
of more familiarity with such hues on the part of the women. The errors could 
be explained in some cases as arising from the fact that some female subjects 
imagined the color out of the test context, applied it to some extraneous situa- 
tion such as clothing, and evaluated or made their subsequent judgments in 
terms of “like or dislike,” rather than in terms of similarity to the center color.

Only two of the triplets have significantly high error scores for the male group, 
as compared to the female group. These are triplets number 8 and 21. On 
both of these triplets there is a relatively high percentage of error for both sexes. 
In both triplets the brightness is an important factor influencing the choice and 
probably accounts for many of the errors. Number 8 would seem to represent 
a typical protanope choice and error on number 21 would indicate a deuteranope 
choice. The large error on these two triplets might be due, then, to a cumulation
of such typical errors on each triplet. 

From the percentages in the last row (table 4) it is possible to pick out those 
triplets having the most, individual, diagnostic value. Representative of these 
triplets are 5, 8, 10, 18, 15 and 21, which account for more than 50 per cent of
the total retest errors. Other triplets, while not quite so diagnostic, are fairly
significant. The diagnostic or error value of the individual triplets will receive
further consideration in the analysis of data from tables 8 through 12. It
should be pointed out, however, that certain of the triplets seem to add little
to the diagnostic value of the test as a whole.
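
The idea of ranking triplets by their share of the total error can be sketched as follows. The error counts used here are invented purely for illustration and are not the values of table 4; the function names are hypothetical.

```python
def percent_contribution(errors_by_triplet):
    """Return each triplet's share of the total errors, in per cent."""
    total = sum(errors_by_triplet.values())
    return {t: 100.0 * n / total for t, n in errors_by_triplet.items()}


def most_diagnostic(errors_by_triplet, threshold=50.0):
    """Take triplets in descending order of share until the running
    total first exceeds the threshold."""
    shares = percent_contribution(errors_by_triplet)
    chosen, running = [], 0.0
    for triplet, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        if running > threshold:
            break
        chosen.append(triplet)
        running += share
    return sorted(chosen), running


# Hypothetical error counts per triplet (not the study's data).
example = {5: 30, 8: 25, 21: 20, 10: 10, 2: 10, 1: 5}
print(most_diagnostic(example))
```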

In table 5 a breakdown chart of all of the errors made on the test and retest
is presented in terms of sexes. No new evidence is offered in this table but
additional evidence illustrating the consistency of errors is presented. The data
are given in terms of the individual triplets missed. In this chart the scores
for both groups on test and retest are presented so that error scores per item may
be compared for both groups, and for both sexes. No percentage scores are
given since they would represent spurious values in view of the variability of
response noted in comparing the test and retest error scores. Inter-comparisons
between the male totals (in Row X + Y) and the female totals (in Row A + B)
bear out the triplet error scores presented in table 4. Likewise, the totals further
substantiate the percentage figures listed in the last row of the previous table.
This would indicate that even though there is an appreciable drop in errors from
test to retest, the triplets causing the large error scores in the first test are the
same as those causing the errors in the retest.

There is another method of considering the error scores which is important 
in the test analysis. It is of value to know the extent of the range of error on 
the test, to determine what the function of the individual item is in the 
establishment of this range, and to find out which triplets are involved in the 
various error-score groups. In other words, it is important that we answer the 
questions: Are there constellations of errors to be found in these various error 
groups? Which specific items does a subject miss when he fails on two items, or 
three items or X items? For purposes of diagnosing the individual color defect 
this is one of the most important facts to be considered. The scores made on the 
retest which are presented in table 6 contain the answer to these questions. 

In this table the item numbers are placed along the X axis and the error- 
score groups along the Y axis. The table is simple to interpret. If one wishes 
to know which triplets are missed by the men making four errors, one traces 
down the Y axis to “4, males,” then reads across the row. In the case cited, 
it will be found that nearly all of the triplets were involved except triplets 6, 7, 
12, 16, 19, 20 and 24. Triplets 8, 13 and 21 are most productive of error at 
this error level. By the use of this table it is possible to isolate the significant 
constellations of triplets most often found linked together. With this material 
before one, it is then possible to analyze the component colors involved in the 
triplets and thus diagnose the individual color difficulty. If patterns could be 
established which were found to appear consistently after testing a large popu- 
lation sample, errors on individual items in the test could be invested with 
additional significance for diagnostic purposes. The results of the constellation 
analysis are presented in tables 8 through 12, and will be considered as soon as 
the comparison analysis is completed. 
For purposes of comparison, the data presented in tables 7 and 7-A are perhaps 
the most meaningful and significant of all the material gathered. These tables 


TABLE 5 
Summary chart of all Nela error 

                                    ITEM NUMBERS (TRIPLETS) 
                                       0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24 
MALES 
First test (x)      Group I error     50  31  19  26  12  33   6  14  70  41  62  22  32  62  22  98  39   4  15  11   9  73  29  13   7 
                    Group II error    49  18  16  19   3  29   1  10  54  22  43  14  13  28  25  39  13   1   7   5   2  40   6   5   5 
                    Total             99  49  35  45  15  62   7  24 124  63 105  36  45  90  47 137  52   5  23  16  11 113  35  18  12 
Retest (y)          Group I error    [row illegible in source] 
                    Group II error    79   7   6   7   2   9   1   2  35  18  21   4   8  21  15  28  13   1   8   4   1  44   4   3   4 
                    Total            161  10  15  10   4  15   1   5  56  22  33  10  11  34  24  53  21   2  12   8   5  65  10   5   5 
Missed on both      Group I error    [row illegible in source] 
test and retest     Group II error    35   5   2   3   2   6   0   0  23   8 [remaining cells illegible in source] 
                    Total            [row illegible in source] 
x + y total         Test and retest  260  59  50  55  19  77   8  29 180  85 138  46  56 123  71 197  73   7  35  24  16 178  45  23  17 
FEMALES 
First test (a)      Group I error     31  20  15  12   6  56  11  11  52  22  41  20  20  27  20  73  34   5  14   6   9  47  23  23   7 
                    Group II error    73  11  12  12   1  48   3   3  53  20  45   8   7  24  20  54  22   1   4   4   5  26   6   6   2 
                    Total            104  31  27  24   7 104  14  14 105  42  86  28  27  51  40 127  56   6  18  10  14  73  29  29   9 
Retest (b)          Group I error    105   4   6   2   2  19   2   1  14  12  17   5   4  12  19  32  13   2   5   5   4  14   7   4   2 
                    Group II error     ?   ?   ?   ?   ?   ?   1   2  36   9  21   4   4  17  13  36  27   0   5   1   2  20   2   6   1 
                    Total            223   9  13   8   4  52   3   3  50  21  38   9   8  29  32  68  40   2  10   6   6  34   9  10   3 
Missed on both      Group I error    [row illegible in source] 
test and retest     Group II error    60   4   2   1   0  23   0   ?  23   7  18   0   1  10   6  25  14   0   2   0   0  13   1   4   1 
                    Total            [row illegible in source] 
a + b total         Test and retest  327  40  40  32  11 156  17  17 155  63 124  37  35  80  72 195  96   8  28  16  20 107  38  39  12 
TOTAL ERROR PER TRIPLET, MALES AND FEMALES 
x + y plus a + b    Test and retest  587  99  90  87  30 233  25  46 335 148 262  83  91 203 143 392 169  15  63  40  36 285  83  62  29 

[? = cell illegible in source] 


show the inter-comparisons between the male and female groups tested. The 
warning should be reiterated that there was no intention to derive numerical 
scores which would give a definite answer as to just what constitutes “color-blindness.” 


TABLE 6 
Analysis of Nela error by error groups 

[Table body illegible in source. Rows: error groups, each divided into male and female. Columns: item or triplet number on Nela.] 


If a line must be drawn between “normal” and “abnormal” types 

of color vision (and there appears to be no reason for so doing), this 
line can only be drawn in terms of the particular job or situation for which the 
color-vision test is being given, or in terms of a valid, standardized test with 
well established norms. In the case of monochromatic vision there is, of course, 
no problem involved. Aside from the case of monochromatic vision there is no 
objective means of arriving at a “color-vision index” or an “amount of 
deficiency” which may be meaningfully termed “color-blindness.” The line drawn 


TABLE 7 
Total error scores on Nela 

                        MALES                          FEMALES                     MALES AND FEMALES 
NUMBER OF     Fre-     Per       Cumulative   Fre-     Per       Cumulative   Fre-     Per       Cumulative 
ITEMS WRONG   quency   cent      per cent     quency   cent      per cent     quency   cent      per cent 
  0           161      48.0      52.0         223      48.4      51.6         384      48.2      51.8 
  1            82      24.5      [illeg.]     131      28.4      [illeg.]     213      26.8      25.0 
  2            36      [illeg.]  16.0          55      11.9      [illeg.]      91      11.6      13.4 
  3            24      7.0       9.0           26      [illeg.]  [illeg.]      50      6.3       [illeg.] 
  4            18      [illeg.]  [illeg.]      14      3.0       [illeg.]      32      4.0       [illeg.] 
  5             2      0.5       3.0            6      1.3       [illeg.]       8      [illeg.]  [illeg.] 
  6             3      0.66      2.34           1      0.25      1.25           4      0.5       1.6 
  7             3      0.66      1.68           2      0.5       0.75           5      [illeg.]  [illeg.] 
  8             2      0.5       1.18           0      0.0       0.75           2      0.25      0.85 
 10             3      0.66      0.52           0      0.0       0.75           3      0.25      0.6 
 13             0      0.0       0.52           1      0.25      0.5            1      0.12      0.48 
 14             1      0.25      0.25           0      0.0       0.5            1      0.12      0.36 
 17             0      0.0       0.25           1      0.25      0.25           1      0.12      0.24 
 18             1      0.25      0.00           0      0.0       0.25           1      0.12      0.12 
 21             0      0.0       0.00           1      0.25      0.00           1      0.12      0.00 
Totals        336                             461                             797 

Groups I and II. N = 797. 
TABLE 7-A 
Male-female comparison chi-square values 

                      0 ERRORS   1 ERROR    2 ERRORS   3 ERRORS   4 ERRORS   OVER 4 ERRORS   N 
Male                  .01        [illeg.]   .026       .43        1.92       1.45            336 
Female                .005       [illeg.]   [illeg.]   [illeg.]   .88        1.00            461 
Number of subjects    384        213        91         50         32         27              797 

χ² = 7.44; n = 5; P = .15. 


in the past has been arbitrary, classifying about 33 per cent of the general male 
population as “color-blind” and about 14 per cent of the females as “color-blind.” 
Those persons with deficiencies in color vision not serious enough to be classed 
as “color-blind” and not small enough to be called “normal” have been classed 
as “color-weak.” There is a much larger percentage of the total population in 
the class of “color-weak” than there is in the class of “color-blind.” The color 
discrimination problems of the “color-weak” group are, or may be, quite serious 
because they may not be aware of their slight defect. For practical purposes, 
then, remedial efforts and detailed investigations of the “color-weak” group are 
very important. 

Inverse cumulative percentage figures based on the number of persons in each 
sex group making no errors, one error, two errors, and so forth, are presented in 
table 7, column 4. We find in column 4 that 52 per cent of the men and 51.6 
per cent of the women made more than “zero” errors. At this point the two 
groups are very close in total error scores. The groups start to diverge at one 
error, more women than men dropping out of the cumulative total at this level. 
The reason for this becomes apparent if we examine the figures for both males 
and females in column 3. This column gives the per cent of each sex belonging 
to the error-group. Eighty-two of the 336 men (24.5 per cent) missed one 
triplet; 131 of the 461 women (28.4 per cent) missed one triplet. There is a 
difference at this point of about 4 per cent between the sex groups. If the 
percentage figures are traced down the columns the differences become smaller. At the 
“10-error” point, the sex differences almost disappear. It should be remem- 
bered, however, that the number of cases gets appreciably smaller as the error 
score rises above 5, and therefore these relations are not as dependable as those 
at the upper end of the scale. 
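
The inverse cumulative figures described above can be recomputed from the frequency columns alone. The following is a minimal sketch, assuming the male and female frequencies as best they can be read from table 7; the function and variable names are illustrative and are not part of the original report, and recomputed values may differ in the last decimal from the rounded figures printed in the table.

```python
# Frequencies by number of items wrong, as best they can be read from table 7
# (an assumption: several cells of the printed table are hard to read).
ERROR_SCORES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 13, 14, 17, 18, 21]
MALE_FREQ = [161, 82, 36, 24, 18, 2, 3, 3, 2, 3, 0, 1, 0, 1, 0]
FEMALE_FREQ = [223, 131, 55, 26, 14, 6, 1, 2, 0, 0, 1, 0, 1, 0, 1]


def inverse_cumulative_percent(freqs):
    """Return, for each score level, the per cent of the group scoring above it."""
    total = sum(freqs)
    remaining = total
    result = []
    for count in freqs:
        remaining -= count  # subjects whose error score exceeds this level
        result.append(100.0 * remaining / total)
    return result


male_above = inverse_cumulative_percent(MALE_FREQ)
female_above = inverse_cumulative_percent(FEMALE_FREQ)

for score, m, f in zip(ERROR_SCORES, male_above, female_above):
    print(f"more than {score:2d} errors: males {m:5.2f}  females {f:5.2f}")
```

Reading down the printed rows shows where the two sex groups diverge and where they converge again.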

If the accepted standards for “color-blindness” were applied to these data, 
the “line” would be drawn at four errors. Three per cent of the men and 1.5 per 
cent of the women made more than four errors. It is felt by the present 
investigator, however, that this is too low an error score to classify as “color-blind,” 
unless the term means that the subjects above this line are apt to have some 
color difficulty. If this is the definition accepted, then the term “color-weak” 
would be more truly descriptive. It will be noted that 27 of the 797 cases are 
above this line. Further discussion of these data will be presented later in the 
report in comparing the Nela Test with the Ishihara Test and with the wool 
sorting test. 

In order to analyze further the apparent sex differences, Chi-Square values 
(12) were worked out comparing the two sex groups at 0 errors, 1 error, 2 
errors, 3 errors, 4 errors and more than 4 errors. These obtained values are 
presented in table 7-A. It will be noticed that the individual values obtained 
do not even reach unity, except in the case of the 4 and over 4 error groups. The 
Chi-Square value obtained for the comparison was 7.44, which indicates that 
the male and female score differences could have been obtained by chance only 
15 times in one hundred. Stated differently, the chances are 85 out of 100 that 
the obtained differences were not due to chance. Although the P value should 
be .05 or less in order to insure a significant relation, the value obtained does 
indicate that the differences may be significant, but are obscured by sampling 
errors and the rather small number of subjects included in the group. It might 
be suspected that if the number of people tested had been two or three times 
as large, the significance of the difference would have been larger. 
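
For readers who wish to check a comparison of this kind, the sketch below computes an ordinary Pearson chi-square on the two-by-six table of male and female frequencies (0, 1, 2, 3, 4 and more than 4 errors). It is an illustration only: the frequencies are taken from table 7 as best they can be read, the function name is hypothetical, and since the report does not state exactly how its cell values were computed, the result need not reproduce the printed 7.44.

```python
# Subjects at 0, 1, 2, 3, 4 and more than 4 errors (from table 7; the last
# entry pools every score above 4). These readings are an assumption.
MALE = [161, 82, 36, 24, 18, 15]
FEMALE = [223, 131, 55, 26, 14, 12]


def pearson_chi_square(rows):
    """Pearson chi-square and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = pearson_chi_square([MALE, FEMALE])
print(f"chi-square = {chi2:.2f}, degrees of freedom = {dof}")
```

The degrees of freedom agree with the n = 5 given under table 7-A; the chi-square itself comes out somewhat below the printed figure under these assumptions.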

A tetrachoric coefficient of correlation for the male and female error distribu- 
tions was worked out as an additional statistical check. The low coefficient 
(plus 0.24) obtained furnishes additional evidence supporting the Chi-Square 
results. It indicates that the scores for the two sex groups, while proceeding 
in the same direction, are not closely related. 


TABLE 8 
Frequency of error 
Constellation summary 

X. Number of errors                      0    1    2    3    4    5    6    7    8   10   13   14   17   18   21   Total 
Y. Frequency                            [row illegible in source] 
X·Y. Total number of triplets missed     0  213  184  150  128   40   24   35   16   30   13   14   17   18   21     903 

N = 797. 
TABLE 9 
Most and least diagnostic triplets and constellations 

    1                2                     3                     4            B 
“G” ITEMS          PAIRS                 THREES                FOURS        LEAST DIAGNOSTIC 
    1        1-21     10-15       5-8-13     8-13-21       5-8-13-15           3 
    2        2-14     10-16       5-8-15     8-14-15       5-13-15-16          4 
    5        5-8      13-15       5-8-21     8-15-16       8-13-15-21          6 
    8        5-13     13-21       5-9-21     8-15-21                           7 
    9        5-16     14-15       5-10-15    8-16-18                          11 
   10        8-13     14-21       8-9-14     9-14-21                          12 
   13        8-15     15-21       8-10-13    10-14-15                         17 
   14        8-16     15-16       8-10-14    13-15-21                         20 
   15        8-21     16-21       8-10-15                                     22 
   16        9-14                 8-13-15                                     23 
   18        9-21                 8-13-16                                     24 
   19 
   21 
TABLE 10 
Frequency of solitary error on each triplet 

Triplet number                  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24 
Frequency of solitary error     4   5   2   0  19   0   0  30   5  25   0   1   8  [entries for triplets 14 through 24 illegible in source] 
“G” or “B” item                 G   G   B   B   G   B   B   G   G   G   B   B   G   G   G   G   B   G   G   B   G   B   B   B 


FURTHER REVISION 


The data presented in tables 8 through 12 should be considered as a unit. 
Evidence presented in these tables indicates that the Revised Nela used in this 
study can be further abbreviated without loss of discriminatory value. The 
TABLE 11 
Error and per cent error on “G” items 

                                                              “G” ITEMS 
SEX                                   1      2      5      8      9     10     13     14     15     16     18     19     21    TOTAL 
Male              Total error        10     15     15     56     22     33     34     24     53     21     12      8     65      368 
                  Per cent error    2.7    4.0    4.0   15.2    6.0    8.9    9.1    6.5   14.4    5.7    3.3    2.4   17.8 
Female            Total error         9     13     52     50     21     38     29     32     68     40     10      6     34      402 
                  Per cent error    2.2    3.2   12.9   12.3    5.2    9.5    7.2    8.2   16.9   10.0    2.5  [illeg.] [illeg.] 
Male and female   Total error        19     28     67    106     43     71     63     56    121     61     22     14     99      770 
                  Per cent error    2.5    3.6    8.7   13.8    5.6    9.2    8.2    7.3   15.7    7.9    2.8    1.8   12.9 
TABLE 12 
Number of “B” members in error constellations 

[The columns under “Z” (number of “B” members found in error constellations) are illegible in source; only column X and columns 1 through 4 are given.] 

X                    1            2           3                      4 
NO. OF ERRORS        [illeg.]     TOTALS      DIFFERENCE             [illeg.] 
                                              (“G” ERROR) 
  1                   6           213         207                    [illeg.] 
  2                  11            91          80                    0 
  3                  12            50          38                    0 
  4                  18            32          14                    0 
  5                   6             8           2                    1 
  6                   4             4           0                    0 
  7                   3             5           2                    0 
  8                   2             2           0                    0 
 10                   3             3           0                    0 
 13                   1             1           0                    0 
 14                   1             1           0                    0 
 17                   1             1           0                    0 
 18                   1             1           0                    0 
 21                   1             1           0                    0 
Y (totals)           70           413         343                    3 

Total error on “B” items (bottom row): 129. 

86 per cent of total error on “G” items. 14 per cent of total error on “B” items. 


suggested revision apparent in these data would result in a final test composed 
of only 13 items. This reduced number is rather astonishing when it is remem- 
bered that the test originally consisted of 47 items. 




The data presented in table 8 are a condensation and recombination of figures 
in earlier tables and are presented here only for convenience in interpreting ad- 
jacent material. In all cases the figures represent the combined scores for both 
groups and both sexes. The score groups are plotted along row X in table 8. 
The number of subjects falling in each score group is plotted along row Y. The 
lower row in this table represents the total number of triplet errors involved in 
each score group (i.e., 92 subjects missed two triplets, thus the total errors for 
this group would be 92 × 2 or 184). 
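
The construction of the lower row of table 8 is simple multiplication, and may be sketched as follows. The frequencies used here are an assumption: they follow table 7 except for the two-error group, where the figure of 92 quoted in the text is used (table 7 lists 91). The variable names are illustrative only.

```python
# Number of subjects at each error score (retest, both sexes combined).
# Assumption: the two-error group uses 92, the figure quoted in the text;
# table 7 lists 91 for the same group.
SUBJECTS_BY_SCORE = {
    1: 213, 2: 92, 3: 50, 4: 32, 5: 8, 6: 4, 7: 5,
    8: 2, 10: 3, 13: 1, 14: 1, 17: 1, 18: 1, 21: 1,
}

# Each subject in a score group contributes as many triplet errors as the score.
triplet_errors = {score: score * n for score, n in SUBJECTS_BY_SCORE.items()}
total_errors = sum(triplet_errors.values())

for score in sorted(triplet_errors):
    print(f"{score:2d} errors x {SUBJECTS_BY_SCORE[score]:3d} subjects = {triplet_errors[score]:3d}")
print("total triplet errors:", total_errors)
```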


The data in table 9 were obtained from analysis of the figures plotted in table 
6. This table presents the most diagnostic or discriminating triplets, or triplet 
combinations, found in the testing program. These triplet combinations having 
large error scores were obtained by plotting the scores for errors on a chart. 
Plotted on the chart, by means of tally marks, the triplet combinations could be 
located quite readily. The same procedure was carried out for triplets with very 
low error scores on the theory that errors on certain triplets might not be dis- 
criminatory by themselves, but might be highly discriminatory when considered 
in combination with other triplet errors. Constellations of errors involving two, 
three or four triplets were analyzed. Combinations of five triplets, since the 
frequency of such error-groups was relatively small, were disregarded. 
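
The tally procedure described above amounts to counting how often each pair, three or four of triplets is missed together by the same subject. A minimal sketch follows. The error records in it are HYPOTHETICAL and serve only to show the counting; they are not data from this study, and the function name is likewise illustrative.

```python
from collections import Counter
from itertools import combinations

# HYPOTHETICAL records: each set holds the triplet numbers one subject missed.
subject_errors = [
    {8}, {5, 8}, {8, 13, 21}, {5, 8, 13},
    {15, 21}, {5, 8, 13, 15}, {10, 15}, {8, 13, 21},
]


def tally_constellations(records, size):
    """Count every combination of `size` triplets missed by the same subject."""
    counts = Counter()
    for missed in records:
        # A subject missing more than `size` triplets contributes every subset.
        for combo in combinations(sorted(missed), size):
            counts[combo] += 1
    return counts


pairs = tally_constellations(subject_errors, 2)
threes = tally_constellations(subject_errors, 3)

print("most frequent pairs: ", pairs.most_common(3))
print("most frequent threes:", threes.most_common(2))
```

Counting subsets within larger error sets follows the statement that the items missed by all persons with error scores of two or over were analyzed to isolate the pairings.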

The most discriminatory single items, the “G” or “Good” items, are listed 
in the first column in table 9. In this column are found all of those triplets 
which were missed a relatively large number of times by those persons making 
only one error. The distribution of errors by triplet number for those individ- 
uals making only one error is presented in table 10. In this table all of the 
items listed as belonging to the “G” class have relatively large error scores 
against them, and all of these errors were made by subjects missing only one 
triplet. All of the items classed as “B” or “Bad” items have relatively low 
error scores for the group missing only one triplet. It will be noted further that 
of the 213 “solitary” errors made on the test, all but six of these “solitary” 
errors involve items belonging to the “G” or “Good” group of triplets. In 
other words, if a subject misses only one triplet on the test, this item is almost 
certain to be one of those listed in the “G” group. 

The group missing two items is considered next. In table 9 under column 2 
will be found the pairs of triplets which are found associated together most often 
in the item error scores for the group. The items missed by all those persons 
with error scores of two or over were analyzed to isolate these pairings. The 
twenty “most frequently paired” triplets are listed under column 2. It will be 
noted that in none of these combinations of two items does a member of the 
“B” group of items appear. The reason for sorting out these paired combina- 
tions was primarily to find if characteristic patterns are to be found which in- 
clude “B” items, making the inclusion of the “B” items mandatory. No such 
combinations were found of error frequency sufficient to warrant inclusion. 

The same treatment is given to the item combinations appearing in 
columns three and four. Nineteen combinations of three items are found to 
appear often enough to warrant inclusion here, and only three combinations of 
four items are found to be at all significant. No combinations of more than 
four items appear with sufficient frequency to warrant inclusion. These con- 
stellations of errors may be used to good advantage for diagnostic purposes as 
well as for the present purpose of pointing up the “G” item effectiveness. In 
no combination of triplet errors did the “B” items appear with significant fre- 
quency. That an item did not appear with a large score in the “solitary” error 
group is not sufficient reason for eliminating the item from the test since it 
might be of great significance in combination. This, however, did not prove to 
be the case. Neither singly nor in combination were the “B” items potent 
sources of error. Column B in table 9 lists the “B” items. 

Table 11 illustrates the effectiveness of the individual “G” items relative to 
the 770 errors attributable to these items alone. The figures are presented for 
both males and females indicating in each case what per cent of the total error 
is contributed by the individual item. The “diagnostic” triplets noted from 
the data presented in table 4 remain the same, nor does the relative error-effec- 
tiveness of the individual items change significantly. The relation of male error 
to female error remains the same, of course, since the data are from the same 
source, recombined. Items number 5, 8, and 15 retain their discriminative lead. 
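
The per cent figures of table 11 are each item's share of the total “G” item error. As a sketch only, the combined male and female row can be recomputed as below; the error counts are those read from table 11, and the variable names are illustrative.

```python
# Combined (male and female) retest error on each "G" item, read from table 11.
G_ITEM_ERRORS = {
    1: 19, 2: 28, 5: 67, 8: 106, 9: 43, 10: 71, 13: 63,
    14: 56, 15: 121, 16: 61, 18: 22, 19: 14, 21: 99,
}

total = sum(G_ITEM_ERRORS.values())

# Share of the total "G" error contributed by each item, in per cent.
share = {item: 100.0 * n / total for item, n in G_ITEM_ERRORS.items()}

# Items ordered from the largest to the smallest contribution.
ranked = sorted(share, key=share.get, reverse=True)

for item in ranked:
    print(f"triplet {item:2d}: {G_ITEM_ERRORS[item]:3d} errors, {share[item]:4.1f} per cent")
print("total error on G items:", total)
```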

In order to further clarify the role played by the “G” items in the test dis- 
crimination, the data are presented in more graphic form in table 12. In column 
X are the error groups and in the columns under “Z” are listed the actual number 
of “B” items appearing in the “error constellations” considering every error 
made on the retest. The heavy, black line running down through the columns 
indicates the fifty per cent boundary. Any “B” items appearing to the right 
of this boundary line indicate that more than fifty per cent of the triplets 
missed by the error group were “B” items. In only one instance is this true. 
This is in the case of the group missing five triplets. In this class one person 
missed five triplets, three of which were from the “B” group. The chart indi- 
cates that in no case did any subject make errors on the test without including 
in his “error constellation” triplets from the “G” group of items. The only 
exception to this is in the case of those missing one triplet. Six subjects made 
“solitary” errors on “B” triplets. Notice, also, that these six solitary errors are 
spread over eleven triplets. 

Examination of column 1 reveals that 70 persons made errors on the “B” 
items. If it is assumed that the 6 solitary errors might be explained by chance 
factors, this leaves a total of 64 persons missing the “B” items. Contrast this 
total with the total of 343 persons making errors on the “G” items, found in 
column 3. Since 413 persons made one or more errors on the test this means 
that only 16 per cent of the persons made errors on the “B” items, and all but 
6 of these subjects missed “G” items at the same time. In other words, the 
“B” items do not appear to be necessary in the test, since in 99.5 per cent of 
the cases a “G” item is missed along with the “B” item. 

In terms of total number of errors involved the results are the same. Nine 
hundred and three errors on all of the triplets are recorded, and of these, 129 of 
the errors were on the “B” items, as may be seen in the bottom row of the figure. 
In terms of percentages this means that 14 per cent of the total error is attribut- 
able to “B” items, but in only 6 cases, or one-half of one per cent of the cases, 
was a “B” item missed without an accompanying error on a “G” item. The “B” 
items also fail to show up in any “error constellations” of their own. Groups 
of “B” items were not found of sufficiently high error frequency to attract 
attention in pairs, threes or fours. 
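
The percentage statements in the last two paragraphs rest on a few simple ratios, which can be checked with a sketch such as the one below. The counts are those quoted in the text and in table 12; the variable names are illustrative, and the results are left unrounded so that the reader can compare them with the rounded figures in the text.

```python
# Counts quoted in the text and in table 12.
TOTAL_ERRORS = 903          # errors on all triplets
B_ITEM_ERRORS = 129         # errors on "B" items
PERSONS_WITH_ERRORS = 413   # persons making one or more errors
PERSONS_MISSING_B = 70      # persons making errors on "B" items

# Share of all triplet errors falling on "B" and on "G" items, in per cent.
b_error_share = 100.0 * B_ITEM_ERRORS / TOTAL_ERRORS
g_error_share = 100.0 - b_error_share

# Share of error-making persons who missed at least one "B" item, in per cent.
b_person_share = 100.0 * PERSONS_MISSING_B / PERSONS_WITH_ERRORS

print(f"error on B items: {b_error_share:.1f} per cent")
print(f"error on G items: {g_error_share:.1f} per cent")
print(f"persons missing B items: {b_person_share:.1f} per cent")
```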

If the cases involving errors on triplet constellations made up of 50 per cent 
or more “B” item members are summated, 84 such cases are found. Thus, only 
7 per cent of the error constellations are found to be made up of more than 
50 per cent “B” items. 

In view of the data presented in the last four tables it would seem that con- 
siderable further revision of the present test is possible without seriously affect- 
ing the discriminating function. This further shortening would be desirable for 
several reasons. It would make the test easier to administer and shorten the 
total time necessary to make the required judgments. There is also a high 
probability that the shortening of the test would raise the reliability of the test 
as a whole, since subjects are inclined to make careless judgments if the task 
becomes tedious. It would simplify the problem of administering the test to 
large groups of subjects to obtain adequate norms and definitely put the test in 
the class of the “practical” tests for color vision. 

There are, however, factors to be considered on the other side. In reducing 
the test from 47 to 13 items much of the diagnostic value for the individual 
tested is sacrificed. If only the most discriminating items remain in the test 
this leaves but little range for individual patterns of errors to manifest them- 
selves and if the test is to be used in diagnostic work, without additional or 
supplementary color-vision testing techniques, this becomes an important con- 
sideration. The elimination of all of the “easy” items from the test presents 
another problem. It may be desirable to have certain “easy” items in a test 
of this type for padding, in order to instill initial confidence into the hesitant 
subject, particularly in the “border-line” cases. The answers to these problems 
can only be determined by further research using a test made up of only the 
“G” items and comparing the results with those obtained on the present revision. 


ISHIHARA COMPARISON 


Tables 13 and 14 present the results obtained from the Ishihara Color-Blind- 
ness Test administered in conjunction with the Nela Revision. In table 13 
frequency and per cent error data are presented. Two hundred and fifty-four 
males, 75 per cent, and 363 females, 78.7 per cent, made no errors on the Ishihara 
Test. According to the original norms for the test any person making any error 
on the test should be classed as “color-blind.” Using this as a critical point, 
25 per cent of the men and 21 per cent of the women would be classed as “color- 
blind,” according to the Ishihara Test. The usual practice in interpreting the 
Ishihara scores on the short-form test is to allow one error, placing all those 
making more than one error in the “color-blind” group. An inspection of the 
TABLE 13 
Error on Ishihara by total score 

                          MALE                               FEMALE 
ERRORS       Frequency      Per cent         Frequency      Per cent 
             I and II       above            I and II       above 
  0          254            25.0             363            [illeg.] 
  1           34            14.2              52            9.7 
  2           15            9.8               19            5.6 
  3            6            [illeg.]          10            [illeg.] 
  4            1            [illeg.]           7            1.9 
  5            6            6.1                6            1.0 
  6            1            5.8                1            [illeg.] 
  7            1            5.2                1            0.5 
  8            1            5.0                1            0.25 
 10           17            0.0                1            0.0 
Total        336                             461 
TABLE 14 
Analysis of Ishihara plate error 
NUMBER ISHIHARA PLATE NUMBERS 
OF PLATES NO. OF 
MISSED— SEX | \ ’ ates 
ECL 
gems 1 2 3 4 5 6 7 8 10 14 
1 Male 0. 0 2 0 5 if 3 0 6 1-22 
Female 0 0 2 0 10 1 4 O° 8 1 33 
2 Male 0 0 1 1 3 1 0 0 P) 3. 19 
Female 0 0 2 1 2 3 0 5 4 12 
3 Male 0 0 1 2 2 1 1 0 1 ae 
Female 0 0 1 si 5 3 0 1 0 0 5 
4 Male 0 0 0 0 1 0 0 1 Lecit 
Female 0 0 2 2 5 | 1 0 4 2 > 
5 Male 0 0) i 2 4 0 2 0 4 4°13 
Female 0 0 1 54 o 3 3 I 2 0 4 
7 Male 0 0 @) 0 0 0 0 0 0 0 {0 
Female 0 0 0 1 1 1 1 1 0 T 1 
8 Male 0 0 i) i 1 1 1 1 1 ei 
Female 0 0 1 1 1 1 1 r) if 1 ie 
10 Male 0 11 11 11 11 11 11 11 11 je eo 3 
Female 0 0 0 0 0 0 0 0 0 0 0 
Totals sian 0 i. 26 25 53 26 Sh 16 45 | 29 50 
61 
Number of zero scores, 237. Group II subjects. N = 448. 
percentages listed in columns 3 and 6 indicates that even according to this 
criterion 14.2 per cent of the men and 9.7 per cent of the women are to be classed 
as “color-blind.” This would indicate that there is a sex difference of 4.5 per 
cent. This figure exceeds any published findings for similar groups and in the 
light of the results obtained on the Nela Revision would indicate that the Ishi- 
hara is testing some additional factor not tested by the Nela. In view of the 
relatively large percentage figures obtained here a serious doubt could be raised 
as to whether or not this additional factor apparently present is a color-vision 
factor, or something extraneous to the color vision of the subjects tested. One 
possible source of error, or explanation for the large score deviations from the 
norms, might be looked for in the administration of the test. Since the test was 
administered under standard, controlled conditions according to instructions in 
the handbook it does not seem possible that the large percentage figures can be 
explained in this manner (28-A). The most likely source of score deviations 
would seem to be in some extraneous factor present in the Ishihara Test itself. 
One such factor which has aroused the suspicions of previous investigators re- 
garding the Ishihara Test situation is visual acuity. Certainly astigmatic condi- 
tions, and other refractive disorders, would tend to complicate such test situa- 
tions since the subject is required to discern a figure pattern made up of small 
dots. In an effort to check this hypothesis several of the subjects who habitually 
wore glasses correcting for refractive disorders were tested on the Ishihara both 
with and without their glasses. In most cases it was found that the error score 
was lower when the subject wore his or her glasses, which would seem to sub- 
stantiate the hypothesis. The evidence is, of course, incomplete and more re- 
search should be done to explore the factors at work. Cronstadt reports that 
experiments show 25 per cent error due to glasses on Ishihara tests, but no error 
difference due to glasses on wool tests (5-A). 
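
The effect of the two critical points discussed above (no error allowed, or one error allowed) can be traced directly from the frequency columns of table 13. The sketch below is illustrative only: the frequencies are those read from table 13, the function name is hypothetical, and the recomputed percentages may differ by a few tenths from the rounded figures printed in the table and text.

```python
# Ishihara error scores and the number of subjects at each score (table 13).
ISHIHARA_SCORES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
MALE_FREQ = [254, 34, 15, 6, 1, 6, 1, 1, 1, 17]
FEMALE_FREQ = [363, 52, 19, 10, 7, 6, 1, 1, 1, 1]


def percent_above(scores, freqs, allowed_errors):
    """Per cent of the group making more than `allowed_errors` errors."""
    total = sum(freqs)
    above = sum(f for s, f in zip(scores, freqs) if s > allowed_errors)
    return 100.0 * above / total


for allowed in (0, 1):
    m = percent_above(ISHIHARA_SCORES, MALE_FREQ, allowed)
    f = percent_above(ISHIHARA_SCORES, FEMALE_FREQ, allowed)
    print(f"allowing {allowed} error(s): males {m:.1f}  females {f:.1f}  difference {m - f:.1f}")
```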

Although it is claimed that the brightness or intensity factor is negligible in 
the Ishihara Test since the intensity values for the colors are equated in the 
printing of the plates, this factor might be operating to some slight degree in 
the test used in the experiment. Variations in printing samples, even within 
the same print-lot, are notorious, and the task of equating brightness values in 
colored printing inks is almost impossible. This factor might account for a 
small percentage of the aberrant scores. As will be seen, there were several sub- 
jects making high Ishihara error scores who missed none of the Nela Triplets. 
This fact, coupled with the scores made on the wool sorting test which will be 
presented further along in the report, would seem to cast suspicion on the high 
error percentage obtained with the Ishihara. Stromberg’s findings with the 
Ishihara tend to bear out the present figures and indicate the test is too “severe” 
(48-A). 

For possible use in detailed analysis of the scores on the Ishihara the data 
were charted and are presented in table 14. The Ishihara Plate numbers can be 
read down the columns and the error groups along the rows. The question as 
to which items were missed by the subjects making one, two, three, etc., errors 
can be answered by tracing across the desired row. It will be noted that no 
subjects missed plate 1 and very few made errors on plate 2. The data in this 
table are compiled from the scores made by subjects in Group II. 

Of considerably more interest to us here is the comparison shown in table 15. 
Here the total error scores on the Nela Revision are plotted along the X axis 
and the total error scores for the Ishihara down the Y axis. Of the 797 subjects 
315 made no errors on either test. Most of the scores are grouped below the 
4 error mark on each test, resulting in a cluster of scores around the lower left 
corner. 

One very suggestive fact apparent in this presentation is that every subject 
making more than 5 errors on the Nela made errors on the Ishihara. This 
point, it will be remembered, was approximately the suggested dividing line or 
critical point for the Nela Revision. Scores on the Ishihara, then, would tend 
to substantiate this choice and also to validate the revision. The same 
statement cannot be made concerning the Ishihara scores. Twelve of the subjects 


TABLE 15 
Nela-Ishihara comparison—total scores 


ERROR GROUPS ON Pree on ee ea ISHIHARA 
ISHIHARA ; ; TOTALS 
0 1 2 3 4 5 6 7 8 10) 23 | Oe Siete 

10 ae ad ae Poe 1 Lat 18 

8 ‘Togs | 2 

a 1 1 1 3 

6 1 1 2 

5 Zit Auk y) 1 12 
4 Dua # 1 | 1 Su 

3 Bt 8 aa wel 16 

2 i el a ao ah 1 34 

1 87 28" | 10) 6 £ 87 

0 815i 15601 71s) BO P20 7al 4 orl 615 

Nela totals........ 384° |°918 1 91 | 50 | 8218141512131) 1) ee 


making no errors on the Nela Revision are found with error scores on the Ishi- 
hara of 5 or more, a high error score on the Ishihara according to the norms for 
the test. Again it would seem that some other factor must be operating in the 
Ishihara Test situation which is not present in the Nela, or at least does not 
show up in the scores obtained. 

Table 16 presents a comparison of the two tests, item by item. The Ishihara
items are again plotted along the Y axis and the Nela items along the X axis.
The data presented here are gathered from the subjects in Group II. By use of
the chart the question may be answered as to which items were missed on the
Nela when Plate X on the Ishihara was missed, or vice versa. Despite the fact
that the Ishihara was constructed with specific plates intended to discriminate
“red” or “green” defects, the patterns of errors that would be expected on the
two tests are not discernible. The Nela items with high error scores do not appear to
be correlated with specific Ishihara Plates, but rather the errors are spread over
all ten of the significant plates. This would give credence to the hypothesis
that the functions tested for are not unitary, but are generalized.

Before proceeding to the analysis of the racial, age and smoking-habit data, a
brief presentation of the data obtained with the wool sorting test should be
considered, since these data are related to the Ishihara as well as the Nela Revision
test scores. As described earlier, the wool sorting test was administered only to
those persons making a high error score on the Nela Revision as an additional
method of validating the revision.

As would be expected, since both tests require the use of worsteds, all of those
persons making high error scores on the Nela made high error scores on the wool
sorting test. No subject missing more than five Nela Triplets made fewer than
six errors in sorting the yarn samples on the wool sorting test. The number of


TABLE 16 

Items 
ISHIHARA NELA TRIPLET NUMBER 
PLATE 
pUMSER Pee a) 4 5 16.7 8 4 |, tO. ht 12) 33-114 a5 [1601.17 18) 19) 20.|21"1, 22 | 23. | 24 
11 Pee aot a eS bai. 4 4 lor) Osl ¢ ied | ob io lo Pb) 62/2 1-2. 2 
10 Pea oa oso Oo 4 lo dae) cate | 8 + 4 142 18) 218.4) 213 4) 2 
9 ei eee eA Oo eb Gas io 219 1 4°11) 41018 14) 2) 6 2104 2 
8 er einige wal set el eo elo ool A | a TT i424 2 6) 1 dk ie? 
7 eS eOmaseowOlios ie eo oO loo | ¢ b-6.1-6 | 2-11 16 12-12 |) Pan EL 8 
6 oe em ea 2 wo) 2 oa lis fo 4 1 Sot L241 Lek hy ry 
5 cen eos ce io oOo? 4 438 107} Gl £61 8 3} 279138 125 5 
4 See een eee ei A bh | 6 Le G2 Le 61 1.8 
a Peels ou AO OL on Ao | Ol 4 lobe) Lab 18318 24-7) Se) 1 2 
Z a eo eo dL ee eel eee teh | 22 |) Pods £01 44-0) 0 1.0 
1 oe oso 0 O10 1020) 10 0 1070) 0 1.0 10 1.020 100 | 0 | 07 010) 0 
0 1608/9 18 14 |1 127 \1 if 151 (14 135 (2 (8 |22 |21 44 129 0 6 |1 | O 471 2151 0 


N = 448. Group II. 


errors made was roughly proportional to the number of errors on the Nela; those
making the higher Nela error scores made comparatively higher error scores on
the wool sorting test. The range of the errors was from 6 to 38, the total
possible number of errors being 70 if every test item was erroneously classified.
The mean was 18 errors. Since this test was given to a comparatively small
number of subjects, and was added only as a check test, detailed results will
not be presented.

The most interesting data obtained with the wool sorting test resulted from
its use as a cross-check on the Ishihara-Nela comparisons. Because of the
discrepancies noted between the scores on the Nela and the Ishihara, this test
was used to clarify the resulting confusion. Four groups of subjects tested on
both the Nela and Ishihara were selected. Group I contained ten persons selected
at random having zero error scores on both the Ishihara and the Nela Revision.
Group II was made up of persons with low error scores on the Nela and high
error scores on the Ishihara. Group III was made up of ten subjects having
high error scores on both tests. Group IV was made up of persons having high
error scores on the Nela and low error scores on the Ishihara. All of the persons
in these groups were then tested on the wool sorting test (except those in Groups
III and IV, since they had already been tested). Following is a summary of the
results in terms of errors made on the wool sorting test:

An inspection of the scores obtained by the members of the four groups on the 
wool sorting test, though admittedly a small number of cases are involved, 
seems to indicate clearly that the Nela can be validated in terms of this test, if 


TABLE 17

GROUP I (LOW NELA-LOW ISH.)         GROUP III (HI NELA-HI ISH.)
Number      Score                   Number      Score
   1           1                       1          24
   2           0                       2          23
   3           0                       3          33
   4           0                       4           8
   5           0                       5           8
   6           0                       6          12
   7           0                       7          38
   8           0                       8          10
   9           0                       9          26
  10           0                      10           9
Total         (1)                   Total       (191)

GROUP II (LOW NELA-HI ISH.)         GROUP IV (HI NELA-LOW ISH.)
Number      Score                   Number      Score
   1           2                       1           5
   2           0                       2           4
   3           2                       3
   4           0                       4           8
   5           1                       5           6
   6           0                       6           1
   7           0                       7           8
   8           1                       8           5
   9           0                       9           2
  10           0                      10           5
Total         (6)                   Total        (41)


not the Ishihara. The scores for the groups definitely show that the wool sorting
test and the Nela are testing the same factors, and give a strong suggestion
that the Ishihara is testing an additional, possibly an extraneous factor. The
seven errors made on the wool sorting by Groups I and II were all made on two
of the test items. These two items were the two very light, weak blues. When
these two items were classified under the wrong color they were placed in the
“green” category. These errors, then, may be considered as negligible. There
are, however, more errors recorded for Group II than for Group I, which would
indicate that the Ishihara high score has some meaning in terms of the wool
sorting test.
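
The group totals behind this argument can be checked with a short sketch. The
scores are those of table 17 for Groups I, II and III, read so that they
reproduce the printed totals of 1, 6 and 191; Group IV is left out because its
entries could not all be read with certainty. The variable names are
illustrative only.

```python
# Wool sorting errors per subject, as read from table 17.
wool_errors = {
    "I (low Nela, low Ishihara)": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "II (low Nela, high Ishihara)": [2, 0, 2, 0, 1, 0, 0, 1, 0, 0],
    "III (high Nela, high Ishihara)": [24, 23, 33, 8, 8, 12, 38, 10, 26, 9],
}

# Total and mean errors for each group of ten subjects.
totals = {group: sum(scores) for group, scores in wool_errors.items()}
means = {group: totals[group] / len(scores) for group, scores in wool_errors.items()}

for group in wool_errors:
    print(f"Group {group}: total {totals[group]}, mean {means[group]:.1f}")
```

Groups I and II together account for the seven errors mentioned above, against
191 for Group III.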
Examination of the errors made by Group III shows clearly that the group
scoring high (making large error scores) on both the Nela and the Ishihara are
also high error scorers on the wool sorting test. Again, in Group IV, the fact
that the Ishihara is selecting something not sampled by the Nela seems apparent.
However, the figures for Group IV should not be taken at face value since the
“high” Nela-low Ishihara group in this case is not in the same high error-score
class as that of Group III. In considering the data in table 15 it was pointed out
that no subjects making a truly low Ishihara score made a very high error score
on the Nela; the highest Nela score paired with a “zero” error score on the
Ishihara was 5 errors, which is not really high when compared with the error


TABLE 18-A 
Racial comparison 


NUMBER OF ERRORS 


ong. Ge he | om PARTE SAL POH | ras! be | 10 | 13 | 14 
Males 

Per cent of Jewish 
ferieseeen.....clu.t dl 31 Le 2 2 O10 0 0 0 0 

Per cent of non-Jewish 
meee ate oa meg. ES MLR oR 8 op Sop peR LG | EEG 

Females 

Per cent of Jewish 
pene. es. 49 25 16S if 1 1 1 1 OFne"0 0 

Per cent of non-Jewish 
OS ae Aq «|. 25 BLS Sls a ae N ae I 1 i! 1 1 0 0 


“Non-Jewish,” 296; “Jewish,” 120. Total N, 416.


scores ranging from 10 to 18 on the Nela, scores of the subjects who make up 
Group III. 

These data are included only as additional validation and to point up the
Ishihara-Nela comparison. They serve the function of a “spot-check” and no more.
To give a complete answer to the question of validation, more thorough test
comparisons must be made, with more valid instruments, on larger groups of
subjects. This was considered of secondary importance at the time since the
test was in the process of revision and efforts should be expended in validating
the final revision, and not each intermediary form of the test during the process
of developing this final revision.


FACTOR COMPARISONS 


The material in the following section is presented more for its interest than 
for its value as experimental data. In most cases the number of the cases in- 
cluded under the sub-groups is too small to be of significance in the establish- 
ment of norms. 

Table 18-A presents the data obtained on the two major “racial” groups tested
on the Nela Revision. Only the “Jewish” and “non-Jewish” groups contained
a sufficient number of cases to give a basis for comparison. The data considered
are taken only from Group II. Itemized data concerning the groups used may
be obtained by consulting table 2.

The table indicates that in the group tested in this experiment, the “Jewish”
group, both males and females, are superior to the “non-Jewish” group tested
in terms of errors on the Nela. The “Jewish” group in the case of the males
and also in the female group have a larger proportion of members making no
errors on the test. These initial differences between the two “racial” groups
are more noticeable in the men than in the women. It would seem also that
the “Jewish” male group is superior to “non-Jewish” males, “non-Jewish”
females, and “Jewish” females in terms of the total range of error scores. A
larger percentage of the “Jewish” male group made three errors or less than did
any other group represented here. There is, also, a significant difference between
the “Jewish” and “non-Jewish” male groups at the three-error point. Considering
the small number of cases included, however, the percentage figures are too
similar to permit any valid conclusions being drawn.

It has commonly been assumed that if tobacco has any effect on the function
of color vision this effect is in a negative direction. The toxins accumulated in
the body from the carbon, nicotine and other agents in the tobacco, or resulting
from its use, have been suspected of “causing” a variety of dysfunctions, one of
these being reduced sensitivity to color stimuli.

In the small group selected for this study it appears that the test scores of
“regular” smokers, as compared to those of the “non-smokers” and “occasional”
smokers, throw some doubt on the old assumption. Among the regular
smokers, more than 50 per cent made no
errors on the Nela, while in the groups which never smoke, or smoke but
occasionally, the figure is just slightly over 40 per cent for the “non-error” group.
Further, none of the members of the “regular-smoker” group made more than
5 errors on the Nela, while the “non-smokers” and “occasional smokers” have
within their members subjects making as many as 17 errors on the Nela. This
latter difference may be due to the larger sample included in the “non-smoker”
and “occasional smoker” groups, the larger number raising the probability of
including more extreme cases. Again, the data presented here are interesting
and suggestive and would certainly merit additional research.

The final comparison to be made is on the basis of age groups, the pertinent 
data being presented in tables 19 and 19-A. The only age groups containing 
sufficiently large samples to be of significance are the age groups 18 through 21. 
Above and below these ages, the samples become too small to be of value. 

It will be noted in table 19 that the sex differences get increasingly smaller
with advancing age from 18 through 20 years of age. In only the 18 year group
do the men show evidence of superiority in color judgments which appears
consistent. At this age there are more men than women making error scores of
less than two. But also, in the same age group, there are more men than women
making error scores of more than 4. Again at age 22 the men appear to be
superior to the women, but the number here is so small that these differences
are probably more apparent than real. In the “25 years and over” age group
the men stand out as definitely superior to the women in their Nela Test scores,
but the group is composed of only 12 women and 16 men.

One further comparison of groups was investigated but since negative results
were obtained, the data will not be included here. The subjects were divided
into two groups on the basis of extent of art training which involved the use of
pigments. The minimum amount of training required for membership in the
“art-training” group was one year of senior high school art training, or continual
art training in the home by an artistically inclined parent. One hundred of the


TABLE 19 


Nela error comparison by age groups 


Per cent error scores by sex and age groups 


ZERO ERRORS 1 ERROR 2 ERRORS 3 ERRORS 4 OR MORE ERRORS 
AGE . N 

Male | Female} Male | Female| Male | Female| Male | Female} Male Female 
16 100 50 0 0 0 50 0 0 0 0 4 
17 50 o2 1 23 12 12 13 6 ies ZL 85 
18 45 43 24 30 15 12 8 8 8 7 243 
19 AP M0 BL 30 25 7 10 10 4 5 10 240 
20 ‘a 52 Vi. 30 eM! if 2 i 18 4 98 
21 36 50 aa 20 15 25 8 5 8 0 56 
Dips 45 Ly 30 33 25 50 0 0 0 0 23 
23 44 | 50 33 50 i 0 EZ 0 0 0 11 
24 50 0 Hy 0 0 100 0 0 38 0 ) 
25 or over 50 tt 20 23 10 0 20 0 0 0 26 

TABLE 19-A 
Average Nela error by age groups 
AGE IN YEARS 

16 i 18 19 20 21 22 23 24 De 
Average male error....| 0 ogeelcomy keol koe 4a. UO Oa 0.0 | ed Oe OS 
Average female error..| 1.0 | 0.8 | 1.1 | 0.9 | 0.9 | 0.95 | Bo PeOLOO 2:0) HOD 


subjects qualified for this category. These students’ scores were compared to
the scores of the rest of the group (“non-artists”) and no significant difference
between the two groups was found, either in terms of total error scores or in
terms of patterns of errors. The result is in keeping with results obtained by
other investigators interested in comparisons of the same type in other test
situations (2).


RELIABILITY 


Though admittedly not a very meaningful measure in the case of a test for
color-vision deficiency, reliability figures were obtained for the test as a whole,
and for the individual triplets within the test. The reliability coefficients were
determined by comparing the first test scores with the retest scores. The
method actually applied required the determination of the coefficient of
contingency for each test item, translating this figure into a correlation coefficient for
each item and then averaging the twenty-four coefficients to obtain the mean
coefficient or reliability for the entire test (12). The reliability figure thus
obtained for the test as a whole was 0.64, higher than would be expected from a
cursory inspection of the test and retest scores (see table 20).
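
The averaging step can be sketched as follows. The summary line beneath
table 20 gives the sum of the squared item coefficients as 8.84 for 24 items,
which suggests that the mean was taken as the square root of the mean square;
that reading is an assumption. The formula for the coefficient of contingency
is the standard one, the chi-square value passed to it is hypothetical, and the
step translating each coefficient of contingency into a correlation coefficient
is not reproduced because the report does not give it.

```python
from math import sqrt


def contingency_coefficient(chi_square, n_cases):
    """Pearson's coefficient of contingency, C = sqrt(chi2 / (chi2 + N))."""
    return sqrt(chi_square / (chi_square + n_cases))


def root_mean_square(sum_of_squares, n_items):
    """Mean coefficient taken as the square root of the mean squared coefficient."""
    return sqrt(sum_of_squares / n_items)


# Hypothetical chi-square of 25 for one item tested and retested on 100 cases.
c_example = contingency_coefficient(25.0, 100)

# Figures taken from the summary line of table 20.
mean_r = root_mean_square(8.84, 24)

print(f"example C = {c_example:.2f}")
print(f"mean r    = {mean_r:.2f}")
```

Under this reading the sketch reproduces the mean r of .61 printed beneath
table 20.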

It should be remembered that the test construction is such as to actually dis- 
courage consistency of response in the case of the subject with the serious defect, 
unless we can assume that the subject chooses always on the basis of brightness 
when color cues are absent. If the subject is unable to discriminate between 
the colors included in a given item the chances are 50-50 that he will make a 
correct (or incorrect) choice on a given trial, if brightness factors are disregarded. 
In view of this fact a high reliability coefficient would not be expected. 
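
The argument can be made concrete with a small sketch, assuming the simplest
case described above: a subject who guesses at 50-50 on both the test and the
retest, the two guesses being independent. The fourfold (phi) coefficient is
used here only as a convenient stand-in measure of test-retest agreement; it is
not the contingency method applied in this report, and the counts are
hypothetical.

```python
from math import sqrt


def phi(a, b, c, d):
    """Fourfold correlation for the 2x2 table [[a, b], [c, d]].

    Rows are test (correct, wrong); columns are retest (correct, wrong).
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# 100 guessed responses: every test-retest combination is equally likely.
guessing = phi(25, 25, 25, 25)

# 100 responses that are the same on the retest as on the test.
consistent = phi(50, 0, 0, 50)

print("guessing subject:  ", guessing)
print("consistent subject:", consistent)
```

Pure guessing gives a coefficient of zero and perfectly repeated responses a
coefficient of one, so a group containing subjects with serious defects will
pull the item coefficients downward.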


TABLE 20 
Item reliability by retest method on Nela (as obtained from coefficient of contingency) 
fr ae) 64 (2) 43 (8) SC CS) 24 (5) 29 ~—-(6) 
R = .76 k= .80 KR = .66 R= 237 R = 9 R= .54 
14° (7) .636 (8) 16% 9) 34 (10) .20> (11) 34 (12) 
R= 37 R = .80 R= 8&7 R= 58 R= Abd R = .58 
14 (13) 20 (14) 50 (15) 02 (16) tO Lee) 56 (18) 
R= 37 R= Ad R= .71 R= .72 R= 84 R= .75 
43 (19) 30 (20) 14 (21) .20 (22) 14 .(23) 29 (24) 
R = .66 R= .55 R= 237 R= 45 R= 237 R= .54 
ΣR² = 8.84. ΣR²/n = .37. Mean r = √.37 = .61. N (cases) = 100. N (items) = 24.
P.E. = ±.05.


REMEDIAL WORK


One of the most interesting and tempting aspects of the entire problem of
color-vision is that of possibilities of remedial work to be done with subjects,
in whatever range of defect. As an exploratory experiment, prompted by recent
work done with vitamins in connection with “glare-blindness,” one student with
a particularly high error score on the Nela, Ishihara and Holmgren type tests
offered to serve as a “guinea pig” to test the efficacy of Vitamin A as a
remedial agent.

The subject was a man, 19, of the “White” group who did not smoke. His
score in the original test series was 18 errors on the Nela Revision, 10 errors on
the Ishihara and 42 errors on the Holmgren; consistently high throughout the
series. The subject was required to take 25,000 units per day of Vitamin A
for 25 days, after which he was retested on the entire series of tests. His retest
scores, after taking the prescribed amount of Vitamin A, were as follows: 6 errors
on the Nela, a drop of 12 errors; 4 errors on the Ishihara, a drop of 6 errors; and
8 errors on the Holmgren-type test, a drop of 34 errors. In terms of the total
decrease in error score the change is indeed startling; certainly such a degree of
change was not anticipated. It should be remembered, however, that this is
only one case and there is a large probability that other factors might have been
responsible for the score decrease. The subject reported that, at least to him,
subjectively, the Ishihara Test Plates had an entirely different appearance,
patterns and figures standing out clearly, where he had seen no pattern at all
previously.

Encouraged by the results in the single case, more subjects were picked out
for a controlled experiment to test the remedial properties of Vitamin A. All
sixteen of the subjects used in this experiment were subjects making five or more
errors; error scores for the group ranged from 5 errors to 22 errors. This group
was divided into two matched groups of eight. Group A was the experimental


TABLE 21


Effect of vitamin A on Nela error scores


                GROUP “A”                                  GROUP “B”
Sub. no.   Error score   After   Difference     Sub. no.   Error score   After   Difference
   1           20          12        -8            1           22          21        -1
   2           15          10        -5            2           17          17         0
   3           14           4       -10            3           13          12        -1
   4           14           7        -7            4            9           7        -2
   5            8           2        -6            5            7           8        +1
   6            7           0        -7            6            7           6        -1
   7            5           3        -2            7            5           4        -1
   8            5           0        -5            8            4           4         0
Total          88          38        50         Total          84          79         5
Average drop = 6.0                              Average drop = 0.6


Group A = vitamin “A” group. Group B = milk-sugar group.


group and Group B the control group. An effort was made to pair the members
of the two groups in such a way that each member of Group A would have a
matching member in Group B with approximately the same error score. The
members of Group A were each given 12 Vitamin A capsules (Squibbs “Pure
Vitamin A”). Each capsule contained 25,000 units of the vitamin and the
subjects were instructed to take one capsule each day for the following 12 days.
The members of Group B were given similar capsules, and were told that the
capsule contained Vitamin B, though the capsules were only a milk-sugar
mixture. No member of Group B knew that the capsules he was taking regularly
did not contain Vitamin B. Both groups were tested just before the vitamin
program was begun, and retested 14 days later after they had taken the
prescribed capsules. Every member of both groups had been put through the tests
twice previously as a member of the general experimental group. The results
are presented in table 21.
Every member of Group A made a significantly lower error score following
the vitamin dosage. Subject 7 changed least of all but his score was not high
at the start. Subject 3 shows the largest score change of the group. The total
error score for the group before the dosage was 88 errors, the average 11.
Following the dosage the total score was 38, the average 4.5. The drop in error
score was 50 for the group; the average drop 6 errors. The scores for the group
dropped more than 50 per cent following the vitamin dosage. This score change
is a significant change according to Fisher’s formula for small samples.
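
The report does not reproduce the computation. A minimal sketch of one
small-sample test, Student's t on the paired before-and-after differences, is
given below on the assumption that this is the kind of test intended. The
Group A scores are taken from table 21, read so that they reproduce the printed
totals of 88, 38 and 50; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Group A error scores before and after the vitamin dosage (table 21).
before = [20, 15, 14, 14, 8, 7, 5, 5]
after = [12, 10, 4, 7, 2, 0, 3, 0]

# Paired differences: a negative value is a drop in error score.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

mean_difference = mean(differences)
# Standard error of the mean difference, using the sample standard deviation.
standard_error = stdev(differences) / sqrt(n)

t_value = mean_difference / standard_error
degrees_of_freedom = n - 1

print(f"mean difference = {mean_difference:.2f}")
print(f"t = {t_value:.2f} with {degrees_of_freedom} degrees of freedom")
```

With 7 degrees of freedom the two-sided 1 per cent critical value of t is about
3.50, so a value of this size lies well beyond it, in keeping with the
statement above.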

The scores for the control group, Group B, are quite different. This group
received no vitamin dosage. The total error score for the group at the start of
the experimental period was 84, or an average of nearly 11 errors per subject.
Following the period the score was 79 for the group, a drop of only 5, an average
drop of less than one error for the group, as contrasted with the average of 6
errors dropped by Group A. Considered even in terms of average scores the
differences between the two groups following the dosage are highly significant.
The differences, for the most part, are accentuated when the individual pairs
in the two groups are compared.
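
The pair-by-pair comparison can be sketched as below, assuming that subject k
of Group A is matched with subject k of Group B in table 21; the report does
not list the pairings explicitly. The drops are read from table 21 so as to
reproduce its printed totals of 50 and 5.

```python
# Drop in Nela error score for each subject (positive = fewer errors on retest).
drop_group_a = [8, 5, 10, 7, 6, 7, 2, 5]   # vitamin A group
drop_group_b = [1, 0, 1, 2, -1, 1, 1, 0]   # milk-sugar group

# Advantage of each Group A subject over the matched Group B subject.
advantage = [a - b for a, b in zip(drop_group_a, drop_group_b)]

pairs_favoring_a = sum(1 for value in advantage if value > 0)
mean_advantage = sum(advantage) / len(advantage)

print("advantage per pair:", advantage)
print("pairs favoring Group A:", pairs_favoring_a, "of", len(advantage))
print("mean advantage:", mean_advantage)
```

Under the assumed pairing every pair favors the Group A member, although the
margin is small for pair 7.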

This experiment was conducted only as a preliminary exploratory experiment 
in an effort to isolate some of the remedial factors which might be used in raising 
the efficiency of those persons exhibiting defective color vision. In terms of 
this experimental group the results are highly suggestive and certainly should 
be followed with more complete investigation. The dosage period was admit- 
tedly very short and it is probable that greater changes would result from a 
longer period of dosage. Possibly other vitamins can be used even more effec- 
tively, particularly vitamin B. Further experimental work is now in progress. 


SUMMARY 


A revision of the Nela Test for Color Vision was devised composed of twenty- 
four of the “most discriminatory” test items out of the original forty-seven. 
The items to be included in the revision were selected on the basis of the analysis 
of the Nela Test conducted by Scheidt (44) in 1936, supplemented by an analy- 
sis of results obtained in the administration of the test to four hundred subjects. 

The revised form of the test was administered to 797 college subjects, students 
in elementary Psychology classes at the University of California at Los Angeles. 
The subjects were divided into two groups: Group I subjects were given the 
test and subsequently the retest over the same set of test items with a minimum 
period of one week between the test and retest occasions; Group II subjects 
were given the test and retest consecutively, in the same test period, a period 
of fifteen minutes being the maximum elapsed time between test and retest. 

To facilitate standardization and comparison an eleven plate edition of the 
Ishihara Test for Color-Blindness was administered at the same time to each 
subject, and also a wool sorting test, of the Holmgren type, was administered 
to certain error groups. 

Each subject was required to indicate his status relative to certain other
factors. These additional factors noted
were age, sex, smoking habits, “racial background” and amount of artistic
training. The subjects were sub-divided into groups under these headings and
these sub-groups were compared as to performance on the color-vision test.

Tentative validation was attempted in terms of the scores obtained on the 
Ishihara and the Holmgren-type tests. Reliability figures were obtained in 
terms of the comparison of the test and retest scores. Comparisons were made 
between the various groups, such as sex, race, age and smoking habits, to clarify 
the function of the various factors relative to color-vision deficiencies. 

Total error scores, and scores on the individual items in the test, were analyzed 
in order to determine whether or not further revision of the test was feasible. 


CONCLUSIONS 


1. Certain items in the test were found to differentiate to a small degree
between males and females tested. The discriminatory items for the women are:
5, 14, 15 and 16. The discriminatory items for the men are: 8 and 21.

2. In terms of the groups tested, it was found that there are differences
between male and female subjects in terms of the percentage of “color-weak”
subjects found in each sex, but that this difference does not appear to be nearly as
large or as significant as past experimenters have reported. At the arbitrarily
drawn “color-deficient” line (four errors on the Nela revision), the per cent
error score for men was 3.0 per cent and for women was 1.5 per cent.

3. At either extreme of the error score range the two sex groups tend to include 
approximately equal percentages of cases, and thus, sex differences disappear 
at these points. Relatively equivalent percentages of men and women made 
no errors, and likewise, the relative per cent of the men and women missing 
more than ten items was approximately the same. 

4. Application of the Chi-Square test to the data relative to sex differences 
indicates that the differences, while suggestively large, are not entirely significant 
in terms of the criterion demanded for significance. The Chi-Square value ob- 
tained indicates that the differences noted could have been obtained by chance 
only about 15 times in one hundred. The Chi-Square test further indicated 
that the most important differences between the two sex groups are to be found 
at the level of four errors, or more than four errors. 

5. Analysis of age-group comparisons reveals no significant changes in color- 
vision efficiency related to the age factor, as such. The color-vision differences 
between the two sexes noted above are not accentuated in any of the age groups 
studied in this experiment. There is an indication that extreme scores may be 
affecting the male scores at ages 18-21, serving to equalize apparent differences. 
The color-vision factor, further, seems to be constant for the age period included, 
no increase or decrease being apparent with increasing age in either sex. 

6. Comparison of the Nela Revision with the Ishihara Test reveals discrepancies
in error scores which can probably be explained in terms of intensity and
acuity factors operating in the Ishihara Test. Since such factors are not properly
included in a color-vision test, the discrepancies would seem to favor the
Nela Test. The data obtained from the comparison indicate further that the
Nela is considerably more diagnostic in the important “color-weak” group (the
low error group) than is the Ishihara, since the Nela permits a few errors, while
the Ishihara is an “all-or-none” type test in that one or two errors on the test
are purportedly diagnostic of “color-blindness.” This feature of the Nela, which
allows a range of error for the “color-weak” subject, is a very important and
useful feature of the test.

7. The differences between color-vision “deficiency” in men and in women are
minimized in the Nela results and magnified in the Ishihara results.

8. In terms of the scores made on the Holmgren-type wool sorting test, as
compared to the scores obtained with the Nela revision, the validity of the Nela
is substantiated and the validity of the Ishihara is questioned. Comparison of the
test scores on the Ishihara and the wool sorting test indicates that additional,
and possibly extraneous, factors are operating in the Ishihara Test.

9. The reliability coefficient obtained by comparison of a sample of the test 
and retest scores for the Nela Revision is .64. Apparently low, this figure may 
be interpreted as being actually quite high since the test concerned is a test of 
color-vision deficiency. Since a test of this type is actually forcing the subject 
with gross color-vision defects to make a choice, despite the fact such subjects 
have no basis for making a choice, a truly high reliability coefficient is not to be 
expected. Despite this fact the reliability of the test will probably be raised 
considerably by the elimination of certain of the less discriminatory items and 
the resulting shortened form of the test. 

10. Item analysis of the test reveals certain diagnostic items, and combinations 
of items, which may be used in diagnosis of the individual case. Constellations 
of errors were found to occur significantly in groups of two, three and four items. 
The test items found in the constellations of error were found to be in almost 
every case items which were missed frequently by those subjects making only 
one error.

11. Item analysis revealed that thirteen of the twenty-four test items were
responsible for 86 per cent of the total error, and that these thirteen items were
always involved in errors made by 99.5 per cent of the subjects making errors.
In other words, eleven of the items were found to be of such low discrimination
value as to be practically worthless for the purpose of group testing. It was
suggested that these thirteen items be used in a new revision of the test in order
to test further their discriminatory value. These items were triplets number
1, 2, 5, 8, 9, 10, 13, 14, 15, 16, 18, 19, and 21. The original triplet numbers of
these same items, in the original Nela Test, are: In Test I: 15, 3, 2, 14, 18, 22,
16 and 6; in Test II: 11, 7, 13, 6 and 22. This suggested revision, if found to be
as discriminatory as would appear likely from these data, would increase the
practicability of the test and reduce considerably the present problem of
administration.

12. Comparison of the “Jewish” and “non-Jewish” groups revealed slight
differences in the two so-called “racial” groups favoring the “Jewish” group.
That is, as a group, the “Jews” tended to make lower error scores on the Nela
revision than did the “non-Jews.” Further, within these two “racial” groups
the tendency for females to display superior color-vision discrimination does not
appear. The score for the “Jewish” male group was relatively lower than that
of either the “non-Jewish” females or the “Jewish” females, as well as being
lower than the score for the “non-Jewish” males.

13. An intercomparison of the scores for the sub-groups under “Smoking
Habits,” though admittedly a small group, reveals that smoking would seem to
lower the probability of error on the Nela Revision. Those persons classed as
“regular” smokers, whether heavy or light, made relatively fewer errors than
did the “non-smoking” or “occasional-smoker” groups. The larger percentage
of “smokers” falling in the no-error class, as compared to the rest of the groups,
would indicate a higher sensitivity to color stimulation in the “smoker” group.

14. There was no significant difference in the relative error scores of the
“artist” group and the “non-artist” group. The “artist” group made more errors
at the one or two error levels, but the “non-artists” had more members making
relatively large error scores.

15. Exploratory experimentation in remedial work with subjects exhibiting
color-weaknesses indicates a high probability that such a project might have
beneficial results. One subject given a series of Vitamin A dosages responded
with a surprising degree of change in color sensitivity. The subject made an error
score of 18 on the Nela, and 42 on the Holmgren before the vitamin dosage was
begun. After taking 25,000 units of Vitamin A per day for 25 days the error score on
the Nela was reduced to 6 and on the wool sorting test to 8.

16. Controlled experimentation using sixteen of the “color-weak” group to
further check on the efficacy of Vitamin A as a remedial agent gave highly
significant results for the groups tested. The subjects receiving Vitamin A dosage
made more than 50 per cent fewer errors following the dosage. The average
error score drop was 6 errors per subject following vitamin dosage, and only
0.6 of one error for the matched control group. Further research is indicated.

17. Results would indicate that the range of error scores from four to ten
should be designated as the “color-weak” range. Subjects making more than
ten errors on the revision may be considered as having definite color-vision
deficiencies.